Search for: All records

Creators/Authors contains: "Vempala, Santosh"

Note: Clicking a Digital Object Identifier (DOI) link will take you to an external site maintained by the publisher. Some full-text articles may not be available free of charge during the publisher's embargo period.

Some links on this page may take you to non-federal websites, whose policies may differ from this site's.

  1. We study the complexity of sampling, rounding, and integrating arbitrary logconcave functions given an evaluation oracle. Our new approach provides the first complexity improvements in nearly two decades for general logconcave functions for all three problems, and matches the best-known complexities for the special case of uniform distributions on convex bodies. For the sampling problem, our output guarantees are significantly stronger than previously known, and lead to a streamlined analysis of statistical estimation based on dependent random samples. 
    Free, publicly-accessible full text available June 9, 2026
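
A note on the model: the results above are in the evaluation-oracle setting, where the algorithm may only query values of the logconcave function f. As a minimal illustration of that access model, here is a textbook Metropolis random walk given such an oracle; this is the kind of baseline the paper improves on, not the paper's algorithm, and the function name and tuning parameters below are hypothetical.

```python
import numpy as np

def metropolis_logconcave(log_f, x0, n_steps=10_000, step=0.1, rng=None):
    """Baseline Metropolis walk targeting a density proportional to
    exp(log_f(x)), using only the evaluation oracle log_f.
    `step` and `n_steps` are illustrative tuning parameters."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x0, dtype=float)
    lx = log_f(x)                                    # one oracle call
    samples = []
    for _ in range(n_steps):
        y = x + step * rng.standard_normal(x.shape)  # symmetric proposal
        ly = log_f(y)                                # one oracle call
        if np.log(rng.uniform()) < ly - lx:          # Metropolis filter
            x, lx = y, ly
        samples.append(x.copy())
    return np.array(samples)

# Example: a standard Gaussian restricted to the unit ball (logconcave).
log_f = lambda x: -0.5 * float(x @ x) if x @ x <= 1.0 else -np.inf
draws = metropolis_logconcave(log_f, x0=np.zeros(5))
```

Classically, integration and rounding are reduced to repeated sampling (e.g., through an annealing sequence of densities), which is why improved sampling complexity and stronger output guarantees carry over to all three problems in this oracle model.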
  2. We present a new random walk for uniformly sampling high-dimensional convex bodies. It achieves state-of-the-art runtime complexity with stronger guarantees on the output than previously known, namely in Rényi divergence (which implies bounds in total variation, KL divergence, etc.). The proof departs from known approaches to polynomial-time algorithms for this problem: we use a stochastic diffusion perspective to show contraction to the target distribution, with the rate of convergence determined by functional isoperimetric constants of the stationary density.
    Free, publicly-accessible full text available December 9, 2025
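
For reference, the Rényi-divergence guarantee above is stronger than TV or KL guarantees in the following standard sense (textbook material, not specific to the paper):

```latex
% Rényi divergence of order q > 1 between probability measures \mu and \nu:
R_q(\mu \,\|\, \nu) \;=\; \frac{1}{q-1}
    \log \int \left( \frac{d\mu}{d\nu} \right)^{q} d\nu .
% R_q(\mu \,\|\, \nu) is nondecreasing in q, and
%     \lim_{q \to 1^+} R_q(\mu \,\|\, \nu) = \mathrm{KL}(\mu \,\|\, \nu).
% Combined with Pinsker's inequality,
%     \|\mu - \nu\|_{\mathrm{TV}} \le \sqrt{ \mathrm{KL}(\mu \,\|\, \nu) / 2 },
% a bound in Rényi divergence of any order q > 1 implies bounds in KL
% divergence and total variation.
```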
  3. The connections between (convex) optimization and (logconcave) sampling have been considerably enriched in the past decade with many conceptual and mathematical analogies. For instance, the Langevin algorithm can be viewed as a sampling analogue of gradient descent and has condition-number-dependent guarantees on its performance. In the early 1990s, Nesterov and Nemirovski developed the Interior-Point Method (IPM) for convex optimization based on self-concordant barriers, providing efficient algorithms for structured convex optimization, often faster than the general method. This raises the following question: can we develop an analogous IPM for structured sampling problems? In 2012, Kannan and Narayanan proposed the Dikin walk for uniformly sampling polytopes, and an improved analysis was given in 2020 by Laddha-Lee-Vempala. The Dikin walk uses a local metric defined by a self-concordant barrier for linear constraints. Here we generalize this approach by developing and adapting IPM machinery together with the Dikin walk to obtain polynomial-time sampling algorithms. Our IPM-based sampling framework provides an efficient warm start and goes beyond uniform distributions and linear constraints. We illustrate the approach on important special cases, in particular giving the fastest algorithms to sample uniform, exponential, or Gaussian distributions on a truncated PSD cone. The framework is general and can be applied to other sampling algorithms.
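
To make the linear-constraint special case concrete, here is a minimal sketch of one step of the classical Dikin walk on a polytope {x : Ax ≤ b} (the Kannan-Narayanan setting the paper generalizes). The local metric is the Hessian of the standard log barrier; the step radius r and the code organization are illustrative choices, not taken from the paper.

```python
import numpy as np

def dikin_step(x, A, b, r=0.5, rng=None):
    """One Metropolis-filtered step of the Dikin walk on {x : A x <= b},
    targeting the uniform distribution. r is an illustrative step radius."""
    rng = np.random.default_rng() if rng is None else rng
    n = x.size

    def hessian(z):
        s = b - A @ z                       # slacks; positive in the interior
        return A.T @ (A / s[:, None] ** 2)  # H(z) = A^T diag(s)^{-2} A

    H = hessian(x)
    # Propose y ~ N(x, (r^2 / n) * H(x)^{-1}): a step in the Dikin ellipsoid.
    L = np.linalg.cholesky(np.linalg.inv(H))
    y = x + (r / np.sqrt(n)) * (L @ rng.standard_normal(n))
    if np.any(A @ y >= b):                  # left the polytope: reject
        return x

    # Metropolis filter corrects for the state-dependent Gaussian proposal.
    def log_q(H_, u, v):                    # log density of proposing v at u
        d = v - u
        return 0.5 * np.linalg.slogdet(H_)[1] - (n / (2 * r**2)) * (d @ H_ @ d)

    log_accept = log_q(hessian(y), y, x) - log_q(H, x, y)
    return y if np.log(rng.uniform()) < log_accept else x
```

Self-concordance of the barrier keeps the proposal ellipsoid (for small enough r) inside the body; that is the property the IPM machinery described above extends beyond uniform distributions and linear constraints.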
  4. Recent language models generate false but plausible-sounding text with surprising frequency. Such “hallucinations” are an obstacle to the usability of language-based AI systems and can harm people who rely upon their outputs. This work shows that there is an inherent statistical lower bound on the rate at which pretrained language models hallucinate certain types of facts, one having nothing to do with the transformer LM architecture or data quality. For “arbitrary” facts whose veracity cannot be determined from the training data, we show that hallucinations must occur at a certain rate for language models that satisfy a statistical calibration condition appropriate for generative language models. Specifically, if the maximum probability of any fact is bounded, we show that the probability of generating a hallucination is close to the fraction of facts that occur exactly once in the training data (a “Good-Turing” estimate), even assuming ideal training data without errors. One conclusion is that models pretrained to be sufficiently good predictors (i.e., calibrated) may require post-training to mitigate hallucinations on the type of arbitrary facts that tend to appear once in the training set. However, our analysis also suggests that there is no statistical reason that pretraining will lead to hallucination on facts that tend to appear more than once in the training data (like references to publications such as articles and books, whose hallucinations have been particularly notable and problematic) or on systematic facts (like arithmetic calculations). Therefore, different architectures and learning algorithms may mitigate these latter types of hallucinations.
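
The “Good-Turing” quantity in the abstract, the fraction of fact occurrences that are singletons, is simple to compute once facts have been extracted; extraction itself is the hard part and is abstracted away in this hypothetical sketch.

```python
from collections import Counter

def good_turing_singleton_fraction(fact_occurrences):
    """Good-Turing missing-mass estimate: the number of facts seen exactly
    once, divided by the total number of occurrences. Per the abstract, this
    quantity approximately lower-bounds the hallucination rate of a
    calibrated model on "arbitrary" facts. `fact_occurrences` is one
    hashable token per occurrence of a fact (a hypothetical abstraction)."""
    counts = Counter(fact_occurrences)
    total = sum(counts.values())
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / total

# Example: 6 occurrences; facts b, c, d each appear exactly once -> 3/6 = 0.5.
print(good_turing_singleton_fraction(["a", "a", "a", "b", "c", "d"]))
```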
  5. Even as machine learning exceeds human-level performance on many applications, the generality, robustness, and rapidity of the brain’s learning capabilities remain unmatched. How cognition arises from neural activity is the central open question in neuroscience, inextricable from the study of intelligence itself. A simple formal model of neural activity was proposed in Papadimitriou (2020) and has been subsequently shown, through both mathematical proofs and simulations, to be capable of implementing certain simple cognitive operations via the creation and manipulation of assemblies of neurons. However, many intelligent behaviors rely on the ability to recognize, store, and manipulate temporal sequences of stimuli (planning, language, and navigation, to name a few). Here we show that, in the same model, time can be captured naturally as precedence through synaptic weights and plasticity, and, as a result, a range of computations on sequences of assemblies can be carried out. In particular, repeated presentation of a sequence of stimuli leads to the memorization of the sequence through corresponding neural assemblies: upon future presentation of any stimulus in the sequence, the corresponding assembly and its subsequent ones will be activated, one after the other, until the end of the sequence. If the stimulus sequence is presented to two brain areas simultaneously, a scaffolded representation is created, resulting in more efficient memorization and recall, in agreement with cognitive experiments. Finally, we show that any finite state machine can be learned in a similar way, through the presentation of appropriate patterns of sequences. Through an extension of this mechanism, the model can be shown to be capable of universal computation. We support our analysis with a number of experiments that probe key limits of learning in this model. Taken together, these results provide a concrete hypothesis for the basis of the brain’s remarkable abilities to compute and learn, with sequences playing a vital role.
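
As a toy illustration of the central mechanism above (precedence encoded through synaptic weights and Hebbian plasticity), here is a minimal single-area simulation in the spirit of the assembly model. The parameter names n, k, p, beta follow the model's usual conventions, but the protocol below is a simplified sketch for illustration, not the paper's exact construction or experiments.

```python
import numpy as np

def memorize_and_recall(n=1000, k=50, p=0.05, beta=0.1,
                        seq_len=3, rounds=5, seed=0):
    """Toy single-area sketch: memorize a sequence of stimuli as assemblies,
    then recall it from its first element. n: neurons, k: cap size (top-k
    winner-take-all), p: random edge probability, beta: plasticity increment.
    """
    rng = np.random.default_rng(seed)
    W = (rng.random((n, n)) < p).astype(float)   # W[i, j]: weight of j -> i
    # Each stimulus in the sequence drives its own fixed set of k neurons.
    stimuli = [rng.choice(n, size=k, replace=False) for _ in range(seq_len)]

    def cap(drive):
        return np.argsort(drive)[-k:]            # top-k neurons fire

    # Training: on repeated presentation, edges from the previously active
    # assembly to the current winners are potentiated, encoding precedence.
    for _ in range(rounds):
        prev = None
        for s in stimuli:
            drive = np.zeros(n)
            drive[s] += 1.0                      # external stimulus input
            if prev is not None:
                drive += W[:, prev].sum(axis=1)  # recurrent input from prev
            winners = cap(drive)
            if prev is not None:
                W[np.ix_(winners, prev)] *= 1.0 + beta  # Hebbian update
            prev = winners

    # Recall: present only the first stimulus, then let the area run freely;
    # the potentiated edges should reactivate the assemblies in order.
    drive = np.zeros(n)
    drive[stimuli[0]] = 1.0
    state = cap(drive)
    recalled = [state]
    for _ in range(seq_len - 1):
        state = cap(W[:, state].sum(axis=1))     # no external input
        recalled.append(state)
    return stimuli, recalled

# Usage: overlap of each recalled assembly with its stimulus's neuron set.
stimuli, recalled = memorize_and_recall()
print([np.intersect1d(s, r).size for s, r in zip(stimuli, recalled)])
```

The two-area “scaffolded” representation and the finite-state-machine learning described in the abstract build on this same potentiate-precedence primitive.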